fix(async-op): return 404 when bank doesn't exist instead of raw FK 500#2352
Merged
Conversation
`_submit_async_operation` always INSERTs into `async_operations`, which has an FK to `banks(bank_id)`. Callers that race against bank deletion, or that derive bank IDs before creating the bank (an integration that submits `/consolidate` on a freshly-named bank before its CREATE has been issued), hit `asyncpg.exceptions.ForeignKeyViolationError` out of the INSERT. The FastAPI endpoints' generic `except Exception` then surfaces it as an opaque 500 — but the root cause is a client misuse, not a server fault. Add a bank-existence precheck at the top of the INSERT path in both branches: - `dedupe_by_bank=True` already runs `SELECT 1 FROM banks WHERE bank_id = $1 FOR NO KEY UPDATE` (for serialization, issue #1842). Switch it from `execute` to `fetchval` so the rowcount also gates existence — preserves the lock semantics, just adds a check on the returned value. - `dedupe_by_bank=False` (scoped submits) previously had no lock and no check; add a plain `SELECT 1 FROM banks` for existence only. When the bank is missing, raise `OperationValidationError(404)`. The endpoint's existing `except OperationValidationError` clause already converts that to `HTTPException(status_code=e.status_code, detail=e.reason)` — no API-layer changes needed. Tests: - 2 regression tests for `submit_async_consolidation` (unscoped + scoped) against a missing bank — assert OperationValidationError with status_code=404. - 1 pin test for `submit_async_graph_maintenance`, which has its own pre-INSERT short-circuit (empty queue → no_work=True) that already avoided the FK error. Verified that the existing dedup atomicity tests (`test_consolidation_submit_atomic_dedup.py`, `test_consolidation_retry_dedup_by_bank.py`) still pass — the lock semantics on the dedupe branch are unchanged.
nicoloboschi
approved these changes
Jun 23, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_submit_async_operation(the helper behindsubmit_async_consolidation,submit_async_retain,submit_async_file_retain, andsubmit_async_refresh_mental_model) always INSERTs intoasync_operations,which has an FK to
banks(bank_id). If a caller submits for a bank thatdoesn't exist — race against a concurrent bank deletion, or an integration
that derives bank IDs and submits before the bank's
CREATEhas beenissued — the INSERT raises
asyncpg.exceptions.ForeignKeyViolationError.The FastAPI endpoints wrap submit calls in a generic
except Exceptionandreturn HTTP 500 with the FK violation in the detail. This is a client-side
error misclassified as a server fault. It pollutes 5xx error-rate metrics
with what should be 404s, and exposes asyncpg internals to API consumers.
Fix
Add a bank-existence precheck at the top of the INSERT path in both branches
of
_submit_async_operation:dedupe_by_bank=Truealready runsSELECT 1 FROM banks WHERE bank_id = $1 FOR NO KEY UPDATEforserialization (issue Consolidation: large backlogs don't drain to completion; concurrency decays to ~1 despite WORKER_CONSOLIDATION_MAX_SLOTS #1842). Switch it from
conn.execute()toconn.fetchval()so the returned value gates existence — same SQL,same lock semantics, but the rowcount becomes observable. If
bank_exists is None, raiseOperationValidationError(404).dedupe_by_bank=False(scoped consolidation submits, etc.) had nolock and no check; add a plain
SELECT 1 FROM banksfor existence only.Same
OperationValidationError(404)on miss.The FastAPI endpoints' existing handler already converts
OperationValidationErrortoHTTPException(status_code=e.status_code, detail=e.reason)— no API-layer changes needed.Test plan
hindsight-api-slim/tests/test_async_submit_bank_not_found.py:test_consolidation_submit_on_missing_bank_raises_validation_error—unscoped submit on a non-existent bank asserts
OperationValidationErrorwith
status_code == 404and bank ID in the reason.test_scoped_consolidation_submit_on_missing_bank_raises_validation_error—same but with
observation_scopesset (takes thededupe_by_bank=Falsebranch).
test_graph_maintenance_on_missing_bank_short_circuits— pins thepre-existing behaviour that
submit_async_graph_maintenancereturns{"operation_id": None, "no_work": True}before reaching the INSERT(its own queue precheck already covers the FK case).
Regression sweep on adjacent suites — 59/59 passing:
test_async_submit_bank_not_found.py(new) — 3test_consolidation_submit_atomic_dedup.py— verifies dedup lock semantics(lock behaviour was the only thing I worried about touching). All 4 pass.
test_consolidation_retry_dedup_by_bank.pytest_consolidation_silent_requeue_failure.pytest_consolidation_scope_parallelism.pytest_op_cancellation.pytest_async_batch_retain.pytest_async_retain_tags.pyruff check+ruff formatclean. Pre-commit hooks pass.Compatibility
FOR NO KEY UPDATElock is unchanged.FOR NO KEY UPDATE->FOR UPDATErewrite isidentical, and the existence check is a plain
SELECT 1.Callers that were relying on the 500 (unlikely — it surfaced asyncpg
internals) should already be retrying or surfacing the error.